Convergence research and training in computational bioengineering: a case study on AI/ML driven biofilm-material interaction discovery

Initially, research disciplines operated independently, but the emergence of transdisciplinary sciences led to convergence research, impacting graduate programs and research laboratories, especially in bioengineering and material engineering as presented here. Current graduate curricula fail to efficiently prepare students for multidisciplinary and convergence research, creating a gap between students' skills and research laboratory expectations. We present a convergence training framework for graduate students, incorporating problem-based learning under the guidance of senior scientists and collaboration with postdoctoral researchers. This case study serves as a template for transdisciplinary convergent training projects, bridging the expertise gap and fostering successful convergence learning experiences in computational biointerface (material-biology interface). The 18-month Advanced Data Science Workshop, initiated in 2019, involves project-based learning, online training modules, and data collection. A pilot solution used Jupyter notebooks on Google Colaboratory and culminated in a face-to-face workshop where project presentations and finalization occurred. The program started with 9 experts in four diverse fields creating 14 curated projects in data science (Artificial Intelligence/Machine Learning), material science, biofilm engineering, and biointerface. These were integrated into convergence research through webinars by the experts. The experts chose 8 of the 14 projects to be part of an all-day in-person workshop, where over 20 learners formed eight teams that tackled complex problems at the interface of digital image processing, gene expression analysis, and material prediction. Each team comprised students and postdoctoral researchers or research scientists from diverse domains including computer science, materials science, and biofilm research.
Some projects were selected for presentation at the international IEEE Bioinformatics conference in 2022, with three resulting Machine Learning (ML) models submitted as a journal paper. Students engaged in problem discussions, collaborated with experts from different disciplines, and received guidance in decomposing learning objectives. Based on learner feedback, this successful experience allows for consolidation and integration of convergence research via problem-based learning into the curriculum. Three bioengineering participants, who received training in data science and engineering, have received bioinformatics jobs in biotechnology industries.


Challenge Statement
We asked whether bioengineering students, particularly graduate students, were prepared for laboratory work in convergence science. The convergence science approach combines multiple fields to solve a problem where teammates may come from very distinct fields (e.g., computer science, biological sciences, engineering) [1]. The rise of transdisciplinary sciences has exposed skill gaps among graduate students. The students are entering the workforce from laboratories that likely did not participate in convergent research, and they lack the skills to communicate across diverse research disciplines. This gap in their education prompted us to adopt a dual strategy: a convergence science focus combined with project-based learning.
We present a specialized convergence framework designed for graduate students to bridge this gap. This project-based learning model uses the five first principles from Merrill 2002 [2] (Fig. 1). This approach intertwines problem-based learning with seasoned scientists' mentorship and postdoctoral researchers' collaboration, applying the convergence research approach in each of the five areas (Fig. 1).
By following this model, we provide a blueprint for other transdisciplinary projects. This initiative bridges the expertise gap and nurtures effective convergence learning experiences within the realm of computational biointerface (material-biology interface). Guidance by experts in the field allows students' previous experience to inform their input as they develop transdisciplinary skills.

Novel Initiative
Our case study addresses the unpreparedness of bioengineering graduate students for multidisciplinary and convergent research, which results in a gap between their skills and research laboratory expectations. The aim is to establish a comprehensive synthetic framework capable of addressing intricate scientific and societal challenges at the crossroads of diverse domains.
Our framework (Fig. 2) involved assembling experts across three disciplines (microbiology, material sciences with a focus on 2D materials, and big data) to engage in interdisciplinary ventures (Fig. 1). This collaboration facilitated productive dialogues between computer science and biology students, enhancing mutual comprehension and enriching their understanding of each other's fields. To bolster competencies, we incorporated portions of the Pedagogy-Andragogy-Heutagogy (PAH) continuum [3] into a graduate education convergence framework focused on problem-solving. This combined methodology empowers learners to use firsthand experiences for deeper understanding while keeping a teacher-learner locus of control, with the expert guiding the team, and it culminates in the students' participation in transdisciplinary projects. This solution centers on enhancing students' aptitude for interdisciplinary teamwork and aligning with research laboratory standards. By integrating 2D material science with computational biointerface, we introduced a convergence framework for graduate students. This framework, guided by senior scientists and postdoctoral researchers, embraces problem-based learning and the first principles of instruction by Merrill (2002).

Convergence Problem and Team Science
The framework, which we referred to as the DDMD (Data Driven Material Discovery) Advanced Data Science Workshop, incorporates Merrill's five first principles [2]. The complete program spanned 18 months and included numerous meetings, workshops, and dedicated office hours (Fig. 2). The framework combined team projects, project-based learning, online training modules, and data collection for workshop improvement, using surveys during each step. Through this platform, we engaged participants' existing knowledge with short courses, certificates, and mentorship, which cultivated collaboration, in line with the second principle. We started from students' previous knowledge and expanded it to improve their interdisciplinary skillset. We promoted learning by enabling participants to demonstrate, apply, and integrate their new knowledge [2]. This was done by ensuring that objectives were aligned across the discipline skills learned [4].

Mini-Capstone Projects, Expert Selection, and Group Selection
The first of Merrill's five first principles (2002) asserts that problems pertinent to the real world are most impactful for effective learning. We used this to our advantage when developing the mini-capstones for this framework. The program started with 9 experts in four fields creating 14 curated projects in data science (Artificial Intelligence/Machine Learning), material science, biofilm engineering, and biointerface (Table 1). These were integrated into convergence research through webinars by the experts. Participants, including experts from diverse fields, formed teams and tackled subjects such as digital image processing, gene expression analysis, and material prediction. Of the 14 projects (Table 1), the experts, working with the participants, selected eight to be included in the future all-day in-person workshop, based on their interest. During this event, more than 20 learners formed eight teams to tackle these topics. Each team was composed of students and a postdoctoral researcher or scientist representing each related scientific field (computer, material, biofilm). Organizing the groups this way facilitated a transition from initial skillsets to a convergence skillset.

Online Training with Short Courses, Office Hours, and Pre-Workshop Meeting
From the selected eight projects, the field experts determined the base knowledge required by the participants and curated short courses accordingly. Each short course included topic-specific learning objectives. This allowed us to activate participants' prior knowledge as a foundation for new knowledge (Merrill's second principle, Merrill 2002). The experts used Jupyter Notebook and Google Colab to reduce barriers of access for the participants. The short courses were housed in a Google Classroom for the graduate and undergraduate students to go through at their own pace. Each mini-capstone project had a project outline document describing the science problem, followed by computational tasks to address it. The participants also had access to a template notebook as a project programming playground on a toy dataset (S1.zip). An orientation workshop for these modules allowed the students to choose the projects to join for the remaining workshops and office hours.
Our short courses consisted of nine topics, and we kept the courses available to the participants for 6 months after the workshop concluded. These courses gave the participants background knowledge of the topics within all the projects. Given the convergent nature of these projects, it was essential for the groups to meet and collaboratively address questions. Therefore, the small groups met twice during the interim between workshops.

Certificates and Short Course Workshop
Merrill's third principle is that, to promote learning, new knowledge must be demonstrated by the learners. Our experts implemented certificates for participants who finished their courses. The certificates obtained from the short courses demonstrated the expansion of the students' existing knowledge, allowing students to improve their skillset through the acquisition of new interdisciplinary subjects.

In-Person Workshop, Group Work, and Presentations
The comprehensive process of project creation, team formation, and equipping participants with skillsets for project engagement culminated in an in-person workshop where project presentations and finalization occurred. This approach facilitated ongoing learning for participants, promoted through the application of new knowledge, Merrill's fourth principle [2].
For our final meeting we held an in-person workshop, with attendees able to join online via Zoom. In the first half of the day, one discipline type presented their projects to the whole workshop. Subsequently, we allocated time for collaborative group work, during which students presented their projects' status.
Upon completion of this workshop, students had opportunities to ask questions and continue working on their projects with their teams.

Artifacts and Conference(s)
This framework produced artifacts from each mini-capstone project. The various projects had artifacts produced during the 18 months, with IEEE presentations in both 2021 and 2022 conference proceedings [5][6][7][8][9][10][11][12][13]. Performance in the production of these artifacts (presentations, machine learning models) allowed the participants to integrate their new knowledge (Merrill's fifth principle) [2] into a presentation at the conference. Three machine learning models were created.
We distributed pre- and post-workshop forms via email, as well as a small-group form meant for participants of the office hours. We asked questions pertaining to relevant convergence team-building data (e.g., their DDMD role (academic position), previous experience with various skills, goals), and we asked for feedback on what participants thought went well and what improvements they desired. Out of our 22 participants, we had varying numbers of responses to the surveys, as detailed in the reflection section.

Short Course Workshop
Following the short course workshop, participants rated (1 to 4, 1 = low and 4 = high) the quality of six workshop facets via survey. The following table displays frequency distributions of recorded ratings and mean ratings (Table 2).
The helpfulness of the information about the workshop that you received prior to today: rating frequencies 3, 3, 7
The overall organization of today's event: mean 3.61; rating frequencies 1, 3, 9
The use of your time today: rating frequencies 5, 8
The clarity of expectations of workshop participants: rating frequencies 2, 2, 9
The opportunities today that were provided for participants to ask questions: rating frequencies 1, 12
The process for assigning the teams for the capstone projects: rating frequencies 1, 2, 9
Participants appreciated the short course workshop for its comprehensive information and organized approach. The overviews of various tasks proved invaluable for navigating the upcoming month, and the structured presentation of separate projects was particularly enlightening. Learning about module expectations and the mini-capstone project's scope eased concerns and heightened enthusiasm for the course. The professors' consistent reassurance, encouragement of questions, and ongoing support were highly valued, as was the chance to delve into machine learning content and work on a capstone project. The workshop's well-structured divisions and implementation sparked excitement for upcoming mini-capstone projects and advanced data science exploration. The Python and machine learning modules were singled out as exceptional resources, especially appreciated by those seeking practical skills. The workshop was well received, but we did find comments that the activation activities (Fig. 1) could have been shorter.

Small Groups and Office Hours
To gauge the effectiveness of our small groups' work, we surveyed the participants. The data show good outcomes, but due to the small size of the group (n = 4), the results do not allow robust interpretation. We will explore this dimension in our future cohorts.

In-person Workshop in Montana
This experience culminated in an in-person workshop in Montana where we started with capstone presentations and then gave the student participants the opportunity to present the project's computer science background. We had participants in person and online via Zoom (Fig. 3). We found it was easy to convey the skill information via Zoom since most of the participation was through the short courses.
There were a total of 22 participants in the mini-capstones who responded to a survey. The overall quality ratings are displayed below, disaggregated by the five DDMD (Data Driven Material Discovery) participation roles (Table 4). Recorded ratings appeared to be independent of DDMD role. The mean quality rating was 9.26. There were no responses lower than 7; therefore, lower ratings are omitted from Table 4. The October 19th workshop had five learning objectives, as seen in Table 5. Participants rated the extent (1 to 5, 1 = not at all, 2 = limited extent, 3 = moderate extent, 4 = substantial extent and 5 = exceptional extent) to which they felt each of the five objectives was attained. Mean ratings and frequency distributions of recorded ratings are displayed in the following table for each of the five objectives.
Explore the cutting-edge of modern biology and material informatics tools, including machine learning, data analysis and visualization, and molecular/multiscale modeling: rating frequencies 4, 7, 10
Learn how to fine-tune general-purpose models for bioscience and material applications.
Learn how to work with small, sparse, or low-quality datasets and build predictive models: rating frequencies 4, 6, 12
Deepen your knowledge of the frontiers of data-driven MOICs and material analysis and ready-to-deploy code solutions: mean 4.05; rating frequencies 1, 6, 6, 9
Learn computational methods and codes for building better materials, such as language models, protein models and graph neural networks, and how to build and use your own custom datasets: rating frequencies 1, 3, 4, 13
Respondents rated the quality (1 to 4, 1 = low quality and 4 = high quality) of four workshop facets. "I didn't do this" was an available response, which two respondents selected for "consultations with experts." Mean ratings and numbers of recorded ratings are displayed in the following table (Table 6).
The mini capstone project presentations: mean 3.67; rating frequencies 7, 14
The programming material presentation: mean 3.67; rating frequencies 7, 14
The group project: mean 3.71; rating frequencies 6, 15
Consultations with experts: mean 3.74; rating frequencies 1, 3, 15
Participants highlighted several workshop strengths, including bridging the gap between biology and computation, promoting collaboration across disciplines, fostering communication and involvement, facilitating interaction between subject matter experts and data scientists, emphasizing the multidisciplinary nature of the workshop, providing high-quality instructions with active participant engagement, offering insights into various capstone projects, showcasing state-of-the-art technologies relevant to participants' future professions, and creating a comfortable and collaborative environment conducive to learning and questioning. The workshop's value was also recognized in terms of understanding the role of machine learning in addressing diverse challenges and the opportunity for one-on-one and group discussions. Based on participants' views on the strengths of this workshop, we accomplished our goal of promoting collaboration and the students' skillset in interdisciplinary convergence science.
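The relationship between the rating frequencies and the mean ratings reported for the four workshop facets can be checked with a short calculation. The sketch below is illustrative only: the assignment of each count to a rating level (for example, 7 ratings of 3 and 14 ratings of 4) is an assumption, chosen because it is the assignment on the 1 to 4 scale that reproduces the reported means, and the names `mean_rating` and `facets` are hypothetical.

```python
def mean_rating(freq):
    """Weighted mean of a {rating: count} frequency distribution."""
    total = sum(freq.values())
    return sum(rating * count for rating, count in freq.items()) / total

# ASSUMPTION: counts are mapped to the highest rating levels (2, 3, 4);
# the source lists only the counts and the means.
facets = {
    "mini capstone project presentations": {3: 7, 4: 14},
    "programming material presentation": {3: 7, 4: 14},
    "group project": {3: 6, 4: 15},
    "consultations with experts": {2: 1, 3: 3, 4: 15},
}

# Means rounded to two decimals, as in the reported table.
means = {name: round(mean_rating(freq), 2) for name, freq in facets.items()}

# Number of recorded ratings per facet.
counts = {name: sum(freq.values()) for name, freq in facets.items()}

for name in facets:
    print(f"{name}: n = {counts[name]}, mean = {means[name]:.2f}")
```

Under this assumed assignment, "consultations with experts" has 19 recorded ratings, which agrees with two of the 21 respondents having selected "I didn't do this."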

Feedback on Workshop Improvement
Participants provided insightful suggestions for improving the workshop experience. Two respondents advocated for in-person attendance to address time management issues linked to virtual participation (Fig. 3). Another recommended increased engagement for online participants and improved audio/visual components to enhance both in-person and virtual aspects. Four participants sought more time for team collaboration, discussions, and programming tasks. Other suggestions included adhering more closely to the agenda, offering greater background on projects and coding possibilities, improving presentation organization, and providing more meaningful data for machine learning training. Reducing the number of projects was proposed to enable deeper exploration using more robust datasets.
For summative evaluation purposes, 21 respondents retrospectively recorded pre- and post-workshop participation ratings from 1 to 4 (1 = low and 4 = high) of their understanding of what it takes (workflow) to harvest data from a variety of sources to address a specific question.
The paired dependent t-test was used to test for pre/post mean differences at the 0.05 level of significance, with null hypothesis difference = 0 and alternative hypothesis difference ≠ 0. Mean rating, standard deviation (SD), matched-pair dependent t-statistic (t), p-value (p), the correlation between the matched-pair ratings, and effect size are displayed in the following table. The increase in mean ratings of understanding from pre- to post-workshop was statistically significant (t = 3.35, p < 0.0032) and the effect size was medium (0.72). We examined the participants' view of their understanding of what it takes to harvest data from a variety of sources to address a specific question, pre- and post-workshop (Fig. 4). The deep red bars represent pre-workshop ratings and black bars represent post-workshop ratings.
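The paired comparison described above can be sketched in a few lines. This is a minimal illustration, not the analysis behind the reported values: the `pre` and `post` ratings are hypothetical placeholders for 21 respondents, the function name `paired_t` is an assumption, and the effect size shown (mean difference divided by the standard deviation of the differences) is one common choice for paired designs that may differ from the definition used for the reported 0.72.

```python
import math
import statistics

def paired_t(pre, post):
    """Return (t statistic, paired effect size d_z, degrees of freedom)."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)         # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))  # matched-pair t statistic
    d_z = mean_diff / sd_diff                 # equals t / sqrt(n)
    return t, d_z, n - 1

# HYPOTHETICAL ratings on the 1-4 scale; not the study's data.
pre = [2, 3, 2, 3, 3, 2, 4, 3, 2, 3, 3, 2, 3, 4, 3, 2, 3, 3, 2, 3, 3]
post = [3, 3, 3, 4, 3, 3, 4, 4, 3, 3, 4, 3, 3, 4, 4, 3, 3, 4, 3, 3, 4]

t_stat, effect_size, dof = paired_t(pre, post)

# Two-sided critical value of Student's t for 20 degrees of freedom at 0.05.
T_CRITICAL_DF20 = 2.086
reject_null = abs(t_stat) > T_CRITICAL_DF20

print(f"t({dof}) = {t_stat:.2f}, d_z = {effect_size:.2f}, reject H0: {reject_null}")
```

Comparing the statistic against a tabulated critical value keeps the sketch free of third-party dependencies; a statistics package would normally be used to obtain the exact p-value.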

Pre-Post Ratings Workflow Understanding
The outcome exceeded expectations: some projects earned slots for presentation at the international IEEE Bioinformatics conference in 2022 [5][6][7][8], yielding three Machine Learning models published in a journal [9][10][11][12][13][14][15][16][17][18]. Feedback from participants affirmed the success of this experience, underlining the potential for integrating convergence research into the curriculum through problem-based learning. Moreover, the initiative yielded tangible results, with bioengineering graduate students, trained in data science and engineering, securing positions in the biotechnology sector.
Table 8 Summary process
Below are the steps to implement in Undergraduate and Graduate Bioengineering studies, with summer project planning and completion taking about 18 months (Fig. 5).
Step 1. Choose Senior Scientists and Postdoctoral Researchers (the experts) to lead projects
Step 2. Initiate a brainstorming session and allow the experts to create their projects
Step 3. Each expert creates baseline material and puts it in a curated Google Classroom
Step 4. Have a conference call to orient students in the Google Classroom
Step 5. Give the students time to participate in the Google Classroom
Step 6. Combine students in their chosen projects via a skills workshop
Step 7. Hold office hours between workshops
Step 8. Have a conference where students present as the 'other' discipline
Step 9. Students complete their mini capstones, creating an artifact that they can present at a conference
The success of our framework in fostering convergence science teams to address transdisciplinary challenges holds promise. Implementing this model in both undergraduate and graduate studies could shape the next generation of researchers. By championing collaboration, problem-solving, and integration of convergent research methodologies, we pave the way to bridge knowledge gaps, prepare our students for convergence science, and propel scientific innovation.

Declarations
Each project was required to have 3 office hours led by the project expert. A 2-hour office hour introduction to the project and a subsequent 1-hour work time and question/answer session office hour were held.

Figures
Figure 4. Participants

Table 2. Quality of Six Workshop Facets
Table 4. Ratings of the Quality of the Capstone Project Experience
Table 5. Extent of the Attainment of Workshop Learning Objectives